Goto

Collaborating Authors

 transformer factor


From attention to profit: quantitative trading strategy based on transformer

arXiv.org Artificial Intelligence

In traditional quantitative trading practice, navigating the complicated and dynamic financial market presents a persistent challenge. Former machine learning approaches have struggled to fully capture various market variables, often ignore long-term information and fail to catch up with essential signals that may lead the profit. This paper introduces an enhanced transformer architecture and designs a novel factor based on the model. By transfer learning from sentiment analysis, the proposed model not only exploits its original inherent advantages in capturing long-range dependencies and modelling complex data relationships but is also able to solve tasks with numerical inputs and accurately forecast future returns over a period. This work collects more than 5,000,000 rolling data of 4,601 stocks in the Chinese capital market from 2010 to 2019. The results of this study demonstrated the model's superior performance in predicting stock trends compared with other 100 factor-based quantitative strategies with lower turnover rates and a more robust half-life period. Notably, the model's innovative use transformer to establish factors, in conjunction with market sentiment information, has been shown to enhance the accuracy of trading signals significantly, thereby offering promising implications for the future of quantitative trading strategies.


Explaining black box text modules in natural language with language models

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated remarkable prediction performance for a growing array of tasks. However, their rapid proliferation and increasing opaqueness have created a growing need for interpretability. Here, we ask whether we can automatically obtain natural language explanations for black box text modules. A "text module" is any function that maps text to a scalar continuous value, such as a submodule within an LLM or a fitted model of a brain region. "Black box" indicates that we only have access to the module's inputs/outputs. We introduce Summarize and Score (SASC), a method that takes in a text module and returns a natural language explanation of the module's selectivity along with a score for how reliable the explanation is. We study SASC in 3 contexts. First, we evaluate SASC on synthetic modules and find that it often recovers ground truth explanations. Second, we use SASC to explain modules found within a pre-trained BERT model, enabling inspection of the model's internals. Finally, we show that SASC can generate explanations for the response of individual fMRI voxels to language stimuli, with potential applications to fine-grained brain mapping. All code for using SASC and reproducing results is made available on Github.


Transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors

arXiv.org Artificial Intelligence

Transformer networks have revolutionized NLP representation learning since they were introduced. Though a great effort has been made to explain the representation in transformers, it is widely recognized that our understanding is not sufficient. One important reason is that there lack enough visualization tools for detailed analysis. In this paper, we propose to use dictionary learning to open up these "black boxes" as linear superpositions of transformer factors. Through visualization, we demonstrate the hierarchical semantic structures captured by the transformer factors, e.g., word-level polysemy disambiguation, sentence-level pattern formation, and long-range dependency. While some of these patterns confirm the conventional prior linguistic knowledge, the rest are relatively unexpected, which may provide new insights. We hope this visualization tool can bring further knowledge and a better understanding of how transformer networks work. The code is available at https://github.com/zeyuyun1/TransformerVis


Yann LeCun Team Uses Dictionary Learning To Peek Into Transformers' Black Boxes

#artificialintelligence

Transformer architectures have become the building blocks for many state-of-the-art natural language processing (NLP) models. While transformers are certainly powerful, researchers' understanding of how they actually work remains limited. This is problematic due to the lack of transparency and the possibility of biases being inherited via training data and algorithms, which could cause models to produce unfair or incorrect predictions. In the paper Transformer Visualization via Dictionary Learning: Contextualized Embedding as a Linear Superposition of Transformer Factors, a Yann LeCun team from Facebook AI Research, UC Berkeley and New York University leverages dictionary learning techniques to provide detailed visualizations of transformer representations and insights into the semantic structures -- such as word-level disambiguation, sentence-level pattern formation, and long-range dependencies -- that are captured by transformers. Previous attempts to visualize and analyze this "black box" issue in transformers include direct visualization and, more recently, "probing tasks" designed to interpret transformer models.